Speaker Discrimination on Broadcast News and Telephonic Calls Based on New Fusion Techniques

Authors

  • Halim Sayoud
  • Siham Ouamour
Abstract

This chapter describes a new Speaker Discrimination System (SDS), which is part of an overall project called Audio Documents Indexing based on a Speaker Discrimination System (ADISDS). Speaker discrimination consists of checking whether or not two speech segments come from the same speaker. This research domain is an important field in biometrics, since the voice remains an important feature that can be used at a distance (via telephone). However, although some discriminative classifiers exist today, their performance is not sufficient for short speech segments. This issue led us to propose an efficient fusion of such classifiers in order to enhance discriminative performance. The fusion is obtained using three different techniques: serial fusion, parallel fusion, and serial-parallel fusion. Two classifiers were chosen for the evaluation: a mono-Gaussian statistical classifier and a Multi-Layer Perceptron (MLP). Several speaker discrimination experiments are conducted on different databases: Hub4 Broadcast-News and telephone calls. Results show that the fusion improved on the scores obtained by each approach alone. For instance, the authors obtained an Equal Error Rate (EER) of about 7% on a subset of the Hub4 Broadcast-News database with short segments of 4 seconds, and an EER of about 4% on telephone speech with medium segments of 10 seconds. DOI: 10.4018/978-1-60960-563-6.ch017
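
The abstract reports its results as an Equal Error Rate, the operating point at which the false-acceptance rate (different-speaker pairs judged "same") equals the false-rejection rate (same-speaker pairs judged "different"). The following is a minimal sketch of how such a figure can be estimated from two lists of similarity scores. It is not the authors' implementation: the function name `equal_error_rate`, the convention that a higher score means "more likely the same speaker", and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(same_scores, diff_scores):
    """Estimate the EER and the threshold at which it occurs.

    same_scores: scores for segment pairs from the same speaker.
    diff_scores: scores for segment pairs from different speakers.
    Assumption: higher score = more likely the same speaker.
    """
    if not same_scores or not diff_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (|FAR - FRR|, estimated EER, threshold)
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(same_scores) | set(diff_scores)):
        # False rejection: a same-speaker pair scores below the threshold.
        frr = sum(s < t for s in same_scores) / len(same_scores)
        # False acceptance: a different-speaker pair scores at or above it.
        far = sum(s >= t for s in diff_scores) / len(diff_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, invented for illustration only.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of scores the two error rates rarely cross exactly, so the sketch reports the midpoint of the two rates at the threshold where they are closest; interpolating between neighbouring thresholds is a common refinement.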

Similar resources

Speaker Discrimination on Broadcast News and Telephonic Calls Using a Fusion of Neural and Statistical Classifiers

This article describes a new Speaker Discrimination System (SDS), which is a part of an overall project called Audio Documents Indexing based on a Speaker Discrimination System (ADISDS). Speaker discrimination consists in checking whether two speech segments come from the same speaker or not. This research domain presents an important field in biometry, since the voice remains an important feat...

Audio-visual speaker recognition for video broadcast news: some fusion techniques

Audio-based speaker identification degrades severely when there is a mismatch between training and test conditions either due to channel or noise. In this paper, we explore various techniques to fuse video based speaker identification with audio-based speaker identification to improve the performance under mismatched conditions. Specifically, we explore techniques to optimally determine the relativ...

Unsupervised speaker segmentation of broadcast news using MDL-based Gaussian model

This paper proposes an approach for unsupervised speaker segmentation and gender discrimination of broadcast news. In this paradigm, a speaker segmentation mechanism using MDL-based Gaussian model is firstly adopted to determine the speaker changes using mean and covariance of the Gaussian model. These speaker segments partitioned by speaker changes are smoothed and discriminated into male or f...

Speaker Diarization: From Broadcast News to Lectures

This paper presents the LIMSI speaker diarization system for lecture data, in the framework of the Rich Transcription 2006 Spring (RT-06S) meeting recognition evaluation. This system builds upon the baseline diarization system designed for broadcast news data. The baseline system combines agglomerative clustering based on Bayesian information criterion with a second clustering using state-of-th...

UCBN: A new audio-visual broadcast news corpus for multimodal speaker verification studies

The performance of face, voice, and multimodal speaker verification systems in complex and non-controlled scenarios, is typically lower than systems developed in highly controlled environments. With the aim to facilitate the development of robust multi-modal speaker recognition systems, a new multi-modal (audio-visual) Australian broadcast UCBN (University of Canberra Broadcast News) corpus was...

Publication date: 2015